Random CapsNet forest model for imbalanced malware type classification task
Abstract
The behavior of malware varies depending on its type, which affects the strategies of system protection software. Many malware classification models, empowered by machine learning and/or deep learning, achieve superior accuracy in predicting malware types. Machine learning-based models require heavy feature engineering work, which greatly affects their performance. Deep learning-based models, on the other hand, require less feature engineering effort than machine learning-based models. However, components of traditional deep learning architectures, such as max and average pooling, cause the architecture to be more complex and the models to be more sensitive to data. Capsule network architectures, on the other hand, reduce these complexities by eliminating the pooling components. Additionally, models based on capsule network architectures are less sensitive to data, unlike classical convolutional neural network architectures. This paper proposes an ensemble capsule network model based on the bootstrap aggregating technique. The proposed method is tested on two widely used, highly imbalanced datasets (Malimg and BIG2015), for which the state-of-the-art results are well known and can be used for comparison purposes. The proposed model achieves the highest F-Score, 0.9820, on the BIG2015 dataset and an F-Score of 0.9661 on the Malimg dataset. Our model also reaches the state of the art while using 99.7% fewer trainable parameters than the best model in the literature.
Related articles
Random Forest for Malware Classification
The challenge in engaging malware activities involves the correct identification and classification of different malware variants. Various malware programs incorporate code obfuscation methods that alter their code signatures, effectively countering antimalware detection techniques that rely on static methods and signature databases. In this study, we utilized an approach of converting a malware binary int...
Random Forest Classification for Android Malware
Classification techniques such as Support Vector Machines, K-Nearest Neighbours, Decision Trees, Logistic Regression and Naive Bayes have widely been used in the area of intrusion detection research in the security community. They are predominantly used for behaviour based detection methods (anomaly detection methods). In this paper we exclusively apply the ensemble learning algorithm Random Fo...
Random Forest Based Imbalanced Data Cleaning and Classification
The given task of the PAKDD 2007 data mining competition is a typical problem of learning from an extremely imbalanced data set. In this paper, we propose a combination of random forest based techniques and sampling methods to identify the potential buyers. Our method is mainly composed of two phases: data cleaning and classification, both based on random forest. Firstly, the data set is cleaned by t...
Random Projection Method for Scalable Malware Classification
In this poster a new approach for scalable behavioral based malware classification is presented. It is based on the random projection method which is an efficient, effective yet simple dimensionality reduction method. Interestingly, however, the random projection method has not – to the authors’ best knowledge – ever been investigated for its possible usefulness for the malware classification p...
Using Random Forest to Learn Imbalanced Data
In this paper we propose two ways to deal with the imbalanced data classification problem using random forest. One is based on cost sensitive learning, and the other is based on a sampling technique. Performance metrics such as precision and recall, false positive rate and false negative rate, F-measure and weighted accuracy are computed. Both methods are shown to improve the prediction accurac...
Journal
Journal title: Computers & Security
Year: 2021
ISSN: 0167-4048, 1872-6208
DOI: https://doi.org/10.1016/j.cose.2020.102133